Background: Comprehensive protein-protein interaction (PPI) maps are a powerful resource for uncovering the molecular basis of genetic interactions and providing mechanistic insights. Over the past decade, high-throughput experimental techniques have been developed to generate PPI maps at proteome scale, first using yeast two-hybrid approaches and more recently via affinity purification combined with mass spectrometry (AP-MS). Unfortunately, data from both protocols are prone to high false positive and false negative rates. To address these issues, many methods have been developed to post-process raw PPI data. However, with few exceptions, these methods only analyze binary experimental data (in which each potential interaction tested is deemed either observed or unobserved), neglecting quantitative information available from AP-MS such as spectral counts.

Results: We propose a novel method for incorporating quantitative information from AP-MS data into existing PPI inference methods that analyze binary interaction data. Our approach introduces a probabilistic framework that models the statistical noise inherent in observations of co-purifications. Using a sampling-based approach, we model the uncertainty of interactions with low spectral counts by generating an ensemble of possible alternative experimental outcomes. We then apply the existing method of choice to each alternative outcome and aggregate results over the ensemble. We validate our approach on three recent AP-MS data sets and demonstrate performance comparable to or better than state-of-the-art methods. Additionally, we provide an in-depth discussion comparing the theoretical bases of existing approaches and identify common aspects that may be key to their performance.

Conclusions: Our sampling framework extends the existing body of work on PPI analysis using binary interaction data to apply to the richer quantitative data now commonly available through AP-MS assays. This framework is quite general, and many enhancements are likely possible. Fruitful future directions may include investigating more sophisticated schemes for converting spectral counts to probabilities and applying the framework to direct protein complex prediction methods.
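
The sampling idea described under Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exponential mapping from spectral counts to probabilities, the `scale` parameter, the placeholder binary scoring method, and all function names are assumptions chosen only to make the three steps (sample alternative outcomes, apply a binary method, aggregate over the ensemble) concrete.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def count_to_probability(count, scale=2.0):
    """Hypothetical mapping from a spectral count to an observation probability.

    Assumption: low counts give low confidence, high counts saturate near 1.
    """
    return 1.0 - math.exp(-count / scale)


def sample_binary_outcome(spectral_counts, rng):
    """Draw one alternative binary experiment from the quantitative data."""
    return {
        pair
        for pair, count in spectral_counts.items()
        if rng.random() < count_to_probability(count)
    }


def co_purification_score(observed):
    """Placeholder binary method: count the baits that pulled down both preys."""
    preys_by_bait = defaultdict(set)
    for bait, prey in observed:
        preys_by_bait[bait].add(prey)
    scores = defaultdict(float)
    for preys in preys_by_bait.values():
        for a, b in combinations(sorted(preys), 2):
            scores[(a, b)] += 1.0
    return scores


def ensemble_scores(spectral_counts, binary_method, n_samples=200, seed=0):
    """Apply a binary method to each sampled outcome and average the scores."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_samples):
        outcome = sample_binary_outcome(spectral_counts, rng)
        for pair, score in binary_method(outcome).items():
            totals[pair] += score
    return {pair: total / n_samples for pair, total in totals.items()}


# Toy (bait, prey) -> spectral count data, invented for illustration only.
toy_counts = {
    ("B1", "P1"): 30, ("B1", "P2"): 25, ("B1", "P3"): 1,
    ("B2", "P1"): 20, ("B2", "P2"): 18,
}
toy_scores = ensemble_scores(toy_counts, co_purification_score)
```

In this toy setup, the pair supported only by a single-spectrum observation should receive a lower averaged score than the pair supported by high counts in both purifications. Any binary PPI scoring function with the same input and output shape could be passed as `binary_method` in place of the placeholder.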